I. INTRODUCTION


A. THESIS CONTENT AND ORGANIZATION 

This thesis investigates the use of single-source shortest path algorithms in two
separate contexts. The first application, found in Chapter II, is embedded within a J-8,
Organization of the Joint Chiefs of Staff (OJCS) planning program called State of the
Art Contingency Analysis (SOTACA). SOTACA is an interactive, automated tool that
assists staff planners and operations officers in quickly analyzing alternate plans for a
contingency operation [Ref. 1: page 1-2]. SOTACA frequently computes shortest paths
which are used in the routing of friendly forces over a network that represents the area
of operation. The shortest path software currently contained in SOTACA is deficient
in two respects:

(1) it is too slow, and

(2) it ignores alternate shortest paths when they exist.

Chapter II describes an implementation of single-source shortest path algorithms which
resolves these two problems.

The second half of this thesis considers shortest path calculations embedded
within vehicle routing problems over large networks. Whenever two nodes are visited
consecutively by the same vehicle, the shortest path connecting those nodes must be
used. However, only a minute fraction of all possible pairs of nodes will ever be
considered for consecutive visitation. Therefore, it is desirable to avoid computing all
possible shortest paths. For this reason, Chapter III develops an effective way of
providing the shortest path information needed for vehicle routing without solving a
large number of shortest path problems.


B. NETWORK FLOW MODEL STRUCTURE AND NOTATION 
For the purpose of the discussions to follow, the network will be considered to be
a directed graph G = (N, A) consisting of a set of nodes,

N = {1, 2, ..., n},

and a set of arcs (ordered pairs of nodes),

A ⊆ N × N,

where × denotes the Cartesian product. Nodes represent places (or items) of interest to
the modeler while an arc defines the existence of a valid route (or relationship) between
nodes. Associated with each arc is a nonnegative flow parameter, C(i,j), which is the
cost assessed per unit of flow across the arc (i,j).


II. GLOBAL CONTINGENCY PLANNING AND NETWORKS


A. CONTINGENCY PLANNING WITH NETWORKS 

The Organization of the Joint Chiefs of Staff (OJCS) uses network flow models
in the conduct of global contingency planning. This is accomplished with the
assistance of a modern aid to planning program (MAPP) called the State of the Art
Contingency Analysis (SOTACA) model. One of SOTACA's principal functions
provides for the representation of the area of operation in its various dimensions (i.e.,
land, sea, economic, political, . . .), enabling the commander and his staff to
systematically analyze the mission and situation of the assigned task [Ref. 1: page 2-7].
This representation of the operational area takes the form of a network flow model
where nodes represent places of significance and arcs between nodes represent
movement paths for forces.

The primary function of SOTACA's network flow model is to enable the study
and analysis of candidate routes (i.e., shortest paths) for movement of enemy and
friendly forces on an administrative or tactical march. Associated with each arc in the
network are two separate flow costs:

• the physical length in kilometers of the arc
• the time in minutes to traverse the arc

These two flow costs enable staff planners to study both time and distance in
contingency planning.


B. PROBLEM DEFINITION 
1. Background 

In 1985, SOTACA was delivered to the J-8 OJCS by Science Applications
International Corporation and, shortly thereafter, follow-on documentation was
initiated. This included an analyst's guide to using SOTACA with emphasis on the
network flow model and shortest path computations. Concurrent with the follow-on
documentation, J-8 planners were trained to use SOTACA and started to analyze
contingency plans. The subsequent use of SOTACA revealed a problem: model
execution time increased alarmingly when the network approached the maximum
allowable size of 300 nodes and 1250 arcs [Ref. 1]. (This is considered small by
contemporary standards [Ref. 2,3].)

During May-June 1986, the author was assigned to the J-8 OJCS as part of
his Naval Postgraduate School Operations Analysis experience tour. He was tasked by
J-8 with determining:

(1) the shortest path methodology implemented in SOTACA, and
(2) if this methodology was responsible for the increased execution times.

The author's analysis determined that SOTACA uses an implementation of
Floyd's all-pairs shortest path algorithm [Ref. 4: page 210]. Subsequent discussion
between J-8 analysts and the author brought to light that only a small portion of the
all-pairs solution is ever used by SOTACA. Further analysis concluded
that the slow execution of the SOTACA model is rooted both in the network data
structure supporting the implementation of Floyd's algorithm and in the
overabundance of information it produces.

A secondary issue which surfaced during this analysis was a perceived
shortcoming of SOTACA's shortest path implementation, namely, the nonrecognition
of alternate shortest paths when they exist. J-8 analysts felt that the identification and
use of alternate paths could enhance the analysis of a contingency plan via SOTACA.

2. The Issues

The remainder of this chapter addresses two specific issues which were still
unresolved at the conclusion of the author's experience tour. The first is to determine
what can be done to reduce or eliminate SOTACA's slow execution time. The second
is to identify a methodology for generating alternate shortest paths for use by
SOTACA.


C. SOTACA IMPLEMENTATION OF FLOYD'S ALGORITHM
This section provides a brief description of the shortest path methodology and
the associated data structure currently used by SOTACA.
1. Floyd's Algorithm
Floyd's algorithm produces an all-pairs shortest path solution by examining
every path between two nodes and recording that path which has the smallest total flow
cost. This process is repeated for every pair of nodes in the network. Figure 2.1
outlines the SOTACA implementation of Floyd's algorithm.
Underlying SOTACA's use of Floyd's algorithm is a data structure which
contains a description of the network and provides for recording of the computed
shortest paths.


Input:
(1) A variable, NODES, which specifies the total number of nodes in the
    network.
(2) The network description in the form of related vectors FROM-NODE,
    TO-NODE, and FLOW-COST.
Output:
(1) The PATH function.
(2) The PATH-COST function.

1. Initialization
   a. PATH(i,j) = -1, for all i and j ∈ N
   b. PATH-COST(i,j) = ∞, for all i and j ∈ N
   c. For k = 1 to |A|, do:
      (1) PATH(FROM-NODE(k),TO-NODE(k)) = 0
      (2) PATH-COST(FROM-NODE(k),TO-NODE(k)) = FLOW-COST(k)
      End do
   d. UPDATE-FLAG = ON

2. Enumeration
   While UPDATE-FLAG = ON, do
      UPDATE-FLAG = OFF
      For i = 1 to NODES, do
         For j = 1 to NODES, do
            For k = 1 to NODES, do
               If PATH-COST(i,k) + PATH-COST(k,j) < PATH-COST(i,j) then
                  PATH-COST(i,j) = PATH-COST(i,k) + PATH-COST(k,j)
                  PATH(i,j) = k
                  UPDATE-FLAG = ON
               Endif
            End do
         End do
      End do
   End while


Figure 2.1 SOTACA's Implementation of Floyd's Algorithm.
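
The procedure of Figure 2.1 can be sketched in a few lines of Python. This is only an illustration of the figure as reconstructed here, not SOTACA's actual code: the function name, the list-of-lists storage, and the four-node example network are assumptions made for the sketch.

```python
import math

def sotaca_style_floyd(nodes, from_node, to_node, flow_cost):
    """All-pairs sweep in the style of Figure 2.1; nodes are numbered 1..nodes."""
    inf = math.inf
    # Row and column 0 are unused so that indices match the node numbers.
    path = [[-1] * (nodes + 1) for _ in range(nodes + 1)]
    path_cost = [[inf] * (nodes + 1) for _ in range(nodes + 1)]
    for f, t, c in zip(from_node, to_node, flow_cost):
        path[f][t] = 0          # 0 marks a direct arc (f, t)
        path_cost[f][t] = c
    update_flag = True
    while update_flag:          # repeat full sweeps until nothing improves
        update_flag = False
        for i in range(1, nodes + 1):
            for j in range(1, nodes + 1):
                for k in range(1, nodes + 1):
                    via_k = path_cost[i][k] + path_cost[k][j]
                    if via_k < path_cost[i][j]:
                        path_cost[i][j] = via_k
                        path[i][j] = k
                        update_flag = True
    return path, path_cost

# Hypothetical four-node network with arcs (1,2), (1,3), (3,2), (2,4).
path, path_cost = sotaca_style_floyd(4, [1, 1, 3, 2], [2, 3, 2, 4], [4, 1, 2, 5])
print(path_cost[1][4])
```

Every sweep touches all |N| cubed node triples, and the result holds |N| squared entries even when only one row is needed, which is the overabundance of information noted in Section B.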


2. Input Arrays (Network Representation)

The network description is contained in four arc-length (denoted |A|) arrays.
The entries in these arrays are related by position and define an arc in the network.
These arrays identify the origin of an arc (FROM-NODE), the destination of an arc
(TO-NODE), the distance flow cost (DISTANCE), and the traversal time flow cost
(TIME). Outside of these positional relations, the arc information is in random order
in the arrays. That is, the first arc input by the user is placed in the first position of
the arrays, the second arc in the second position, and so on.

3. Output Arrays (Shortest Path Representation)

The shortest path solutions are contained in two pairs (i.e., four total) of n by
n matrices (where n = |N|). Each solution pair, while utilizing the same network, is
for a separate flow problem. The first is concerned with the physical distance between
nodes, while the second involves traversal time between nodes.

Each solution pair utilizes the same method for storing shortest path
solutions. Thus, a solution consists of a PATH-COST function which identifies the
shortest path distance (i.e., the total flow cost) between any two nodes in the network,
and a PATH function [Ref. 4: page 211] which specifies the sequence of nodes on each
shortest path.

The PATH-COST function is an all-pairs version of the well-known label
function [Ref. 2,5: page 15]. The PATH function is also well known, and Figure 2.2
depicts the iterative process for recovering shortest paths from it.


PATH(i,j) = -1   No path exists from node i to node j.

             0   Node j is connected to node i by arc (i,j).

             k   To reach node j from node i, first go to node k and
                 then examine PATH(k,j) to determine which node is next
                 on the path.


Figure 2.2 SOTACA PATH Function.
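
The recovery process of Figure 2.2 can be illustrated with a short recursive sketch in Python. The dictionary storage and the function name are hypothetical. The figure only spells out the step from k to j; the sketch also expands the leg from i to k, which is an assumption that changes nothing when (i,k) is a direct arc.

```python
def recover_path(path, i, j):
    """Return the node sequence from i to j, or None if PATH(i,j) = -1."""
    k = path.get((i, j), -1)
    if k == -1:
        return None                        # no path from i to j
    if k == 0:
        return [i, j]                      # direct arc (i, j)
    first = recover_path(path, i, k)       # assumed expansion of the i-to-k leg
    rest = recover_path(path, k, j)        # "examine PATH(k,j)" as in Figure 2.2
    if first is None or rest is None:
        return None
    return first + rest[1:]                # drop the duplicated node k

# Hypothetical PATH entries, stored sparsely as {(i, j): value}.
example_path = {(1, 3): 0, (3, 2): 0, (2, 4): 0, (1, 2): 3, (1, 4): 2}
print(recover_path(example_path, 1, 4))
```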


Armed with an understanding of how SOTACA currently produces and
records shortest path information, attention is now turned to methods by which the
network module deficiencies can be eliminated.


D. SINGLE-SOURCE SHORTEST PATH ALGORITHMS

SOTACA computes all-pairs shortest paths in a preprocessing method which
typically requires a twenty minute waiting time (on a MicroVAX) for networks at or
near the maximum allowable size. As has been stated previously, SOTACA is able to
perform its force routing function if it has, at a minimum, the shortest path solution
from a specified node to all other nodes in the network. In the literature available
today, this type of problem is often referred to as a single-source shortest path problem
[Ref. 4: page 203]. It is proposed that SOTACA use a single-source shortest path
algorithm to compute shortest paths on demand (i.e., as needed by the user). To this
end, this section presents an examination of two well-known single-source shortest path
algorithms. These two algorithms are label setting and label correcting, and there are
numerous variations of each.

Prior to presenting the algorithms, two changes to the SOTACA data structure
are introduced. These modifications support the functioning of the single-source
shortest path algorithms as well as provide a means to more efficiently represent
network data and shortest path solutions in SOTACA.

1. Modifying SOTACA's Network Data Structure

Two changes to SOTACA's data structure are proposed. The first concerns
the manner in which the network description is represented internally, while the second
involves the method for storing the shortest path solutions.

a. Reorganizing the Network Data 

The network representation SOTACA uses is rather cumbersome when it
comes to locating specific arc information. In locating the set of all arcs which
emanate from node i (this set is called the forward star of i [Ref. 3: page 218]),
SOTACA conducts an arc-length search of the FROM-NODE array. This inefficiency
is part of the larger problem that has surfaced with symptoms of slow execution.

This inefficiency can be overcome in two steps. First of all, the network
data in the original arrays are put through a one-time sort which places the data in
ascending order based upon the FROM-NODE field. As a result, the forward star of
any node is in contiguous space within the arrays. Sequencing the arcs of the network
by forward star yields efficiencies in solving for shortest paths.

At the completion of the one-time sort, the FROM-NODE array is used to
construct an array known as the TAIL array [Ref. 2,5: page 13] which then replaces
the FROM-NODE array. The TAIL array is of length |N|+1. TAIL(i) specifies the
initial position of the contiguous space containing all arc information for arcs
originating at node i, while TAIL(i+1)-1 specifies the last position. Figure 2.3 shows
an example of network data being transformed from the original network
representation to the forward star format.


(a) SOTACA Network Representation (arrays FROM-NODE, TO-NODE, FLOW-COST)

(b) After the one-time sort (arrays FROM-NODE, TO-NODE, FLOW-COST)

(c) Forward Star Network Representation (arrays TAIL, TO-NODE, FLOW-COST)


Figure 2.3 Transformation of Network Data to Forward Star Format.


There are two advantages in using this modified data structure. The first is
the memory savings that occurs when the arc-length vector FROM-NODE is replaced
by the node-length vector TAIL (as most networks are such that |N| << |A|).
However, the second advantage far outweighs any other, as the accessing of specific
arc information has been transformed from an arc-length search to an examination of
only those arcs of present interest.
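
The two-step transformation can be sketched as follows in Python. The function names and the sample arcs are hypothetical, and the sketch keeps the 1-based positions used in the text by leaving index 0 of TAIL unused.

```python
def build_forward_star(num_nodes, from_node, to_node, flow_cost):
    """One-time sort by FROM-NODE, then build TAIL (entries 1..|N|+1)."""
    order = sorted(range(len(from_node)), key=lambda a: from_node[a])
    to_sorted = [to_node[a] for a in order]
    cost_sorted = [flow_cost[a] for a in order]
    tail = [0] * (num_nodes + 2)           # index 0 is unused
    for f in from_node:
        tail[f + 1] += 1                   # count the arcs leaving node f
    tail[1] = 1                            # node 1's arcs start at position 1
    for i in range(1, num_nodes + 1):
        tail[i + 1] += tail[i]             # running total gives start positions
    return tail, to_sorted, cost_sorted

def forward_star(i, tail, to_sorted, cost_sorted):
    """Arcs out of node i occupy positions TAIL(i) through TAIL(i+1)-1."""
    return [(to_sorted[p - 1], cost_sorted[p - 1])
            for p in range(tail[i], tail[i + 1])]

# Hypothetical arcs in user-entry order: (3,4), (1,2), (2,3), (1,3).
tail, to_s, cost_s = build_forward_star(4, [3, 1, 2, 1], [4, 2, 3, 3], [1, 5, 2, 7])
print(tail[1:], forward_star(1, tail, to_s, cost_s))
```

Looking up the forward star of a node now costs time proportional to the number of arcs leaving that node, instead of a scan of all |A| entries of FROM-NODE.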
b. The Predecessor and Label Functions

The reader will recall that SOTACA stores an all-pairs shortest path
solution in the two n by n matrices, PATH and PATH-COST. Solutions to the
single-source shortest path problems can be compactly recorded in two 1 by n arrays called
the predecessor function [Ref. 2: page 10] and the label function. The predecessor
function, P(), is associated with a single source node (i.e., the root node) and contains a
tree of shortest paths. The predecessor function differs from PATH in that it specifies
the backpath from a node to the root node, with each entry indicating which node was
visited (coming from the root node) immediately prior to the current node. PATH, on
the other hand, specifies a forward sequence of nodes for traversing the shortest path
from one node to another. Figure 2.4 depicts the predecessor function.

The label function, U(), contains the total flow cost to reach a node from
the specified root. Thus, it is the one row of the original all-pairs PATH-COST
function associated with the root node r.


P(i) = j   Node j immediately precedes node i on the backpath
           to the root node.

       0   Indicates that i is the root node if U(i) = 0. Otherwise
           indicates that i cannot be reached from the root node.


Figure 2.4 The Predecessor Function.
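
As an illustration of how a route is read out of the predecessor function, the following Python sketch follows the backpath to the root and reverses it. The dictionary storage, the function name, and the sample values are assumptions made for the example.

```python
import math

def backpath(pred, label, node):
    """Follow P() from node back to the root; return the path in root-to-node order."""
    if pred[node] == 0 and label[node] != 0:
        return None                        # P = 0 with a nonzero label: unreachable
    path = [node]
    while pred[path[-1]] != 0:             # P = 0 marks the root
        path.append(pred[path[-1]])
    path.reverse()
    return path

# Hypothetical single-source solution rooted at node 1; node 5 is unreachable.
pred = {1: 0, 2: 3, 3: 1, 4: 2, 5: 0}
label = {1: 0, 2: 3, 3: 1, 4: 8, 5: math.inf}
print(backpath(pred, label, 4))
```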


2. Label Setting Algorithms

One method of solving the single-source shortest path problem is to use a
label setting algorithm [Ref. 2,3,4,5]. In this method, also known as Dijkstra's
algorithm [Ref. 4: page 204], the nodes are partitioned into two sets, labeled and
unlabeled. Labeled nodes are those for which the shortest path from the source is
known, and unlabeled nodes are those for which it is not known. At each iteration, the
method identifies the cheapest unlabeled node which can be reached from a labeled
node, and adds this node to the labeled set. It should be noted that label setting (in
contrast to label correcting) requires:

C(i,j) ≥ 0 for all (i,j) ∈ A

which was stated as an assumption in Chapter I.

The label setting algorithm, as with other single-source shortest path
algorithms, generates a tree T consisting of a single root node r with other nodes
connected to that root by some shortest path. Associated with each node i in the tree
is a label that specifies the cost of the shortest path originating at the root node and
ending at node i.

In the first iteration, the root node r is the only labeled node and it has a label
of zero. At each iteration, the algorithm examines all the arcs originating at labeled
nodes and identifies the unlabeled node (and the associated arc) which is cheapest to
reach next. That unlabeled node is labeled and the associated arc added to the set of
shortest paths, A_T. The algorithm repeats this process |N|-1 times since at each
iteration one node is labeled. Figure 2.5 provides a step by step description of the label
setting algorithm.

3. Improving the Label Setting Algorithm

It does not take much familiarization with the label setting algorithm to see
that there is at least one major inefficiency with it. At each iteration but the first, the
algorithm examines many arcs which it has examined previously, and some which have
no bearing on producing new shortest paths (e.g., those arcs between labeled nodes).
Thus, in a network of |A| arcs, the algorithm ends up examining more than |A| arcs.
To avoid this extra work, the algorithm can be modified so that each arc is examined
only once [Ref. 5].

This is accomplished by setting temporary labels, which are intermediate
guesses at the final (permanent) values of the labels. The algorithm proceeds similarly to
the basic algorithm. It starts with a specified root node and examines its forward star,
setting all labels which can be improved upon and designating these labels as
temporary. There are numerous ways to store temporary labels. The method chosen
in this thesis was to use the sign bit of the predecessor function [Ref. 5]. Thus during
any iteration, the sign of the predecessor function indicates the following:

P(i) = 0 indicates the label for node i has not been set
P(i) < 0 indicates the label for node i is temporary
P(i) > 0 indicates the label for node i is permanent

After the forward star has been examined and all temporary labels set, that
node with the minimum temporary label is identified. This node's label is set
permanent, and its forward star is examined, thus repeating the process. The algorithm
stops when all temporary labels have been set permanent.


Input:
(1) A directed graph G=(N,A) where C(i,j) ≥ 0 for all i and j ∈ N.
(2) A specified root node r.

Output:
(1) The shortest path costs in the label function U().
(2) The shortest path tree in the predecessor function P().

1. Initialize a tree T(N_T, A_T) such that:
   a. N_T = {r}
   b. A_T = { }
   c. U(t) = ∞ for all t ∈ N-N_T
   d. U(r) = 0
   e. P(t) = 0 for all t ∈ N

2. Examine the forward star of all permanently labelled nodes and define:
   S = { (i,j) : i ∈ N_T, j ∈ N-N_T, (i,j) ∈ A }
   IF S = { } THEN proceed directly to step 4.

3. To determine the next label and its associated node, examine each
   element of S and:
   a. Find (k,l) = argmin { U(i) + C(i,j) : (i,j) ∈ S }
   b. Redefine:
      N_T = N_T ∪ {l}
      A_T = A_T ∪ {(k,l)}
      P(l) = k
      U(l) = U(k) + C(k,l)
   c. Repeat step 2.

4. Stop.

[Ref.

Figure 2.5 The Label Setting Algorithm.
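
A minimal Python sketch of the basic label setting steps in Figure 2.5 follows. It is an illustration under stated assumptions rather than the thesis's FORTRAN implementation: the network is held in a hypothetical dictionary mapping each node to its forward star as (successor, cost) pairs, every node appears as a key, and all costs are nonnegative.

```python
import math

def label_setting(graph, root):
    """Basic label setting: rescan every arc leaving the labeled set each iteration."""
    label = {n: math.inf for n in graph}
    pred = {n: 0 for n in graph}
    label[root] = 0
    labeled = {root}                       # N_T in the figure
    while True:
        best = None                        # cheapest (cost, k, l) over the set S
        for i in labeled:
            for j, c in graph[i]:
                if j not in labeled:
                    candidate = (label[i] + c, i, j)
                    if best is None or candidate < best:
                        best = candidate
        if best is None:                   # S is empty: stop
            break
        cost, k, l = best
        labeled.add(l)
        pred[l] = k
        label[l] = cost
    return label, pred

# Hypothetical network; node 5 cannot be reached from the root.
graph = {1: [(2, 4), (3, 1)], 2: [(4, 5)], 3: [(2, 2)], 4: [], 5: []}
print(label_setting(graph, 1))
```

The nested scan over the whole labeled set on every pass is the repeated arc examination that Section 3 sets out to remove.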
The basic label setting algorithm, presented in Figure 2.5, is easily modified to
utilize this improvement. To do this, steps 2 and 3 of the algorithm are replaced by
those shown in Figure 2.6.


To alter the basic algorithm, replace steps 2 and 3 in entirety by the
following steps:

2. Examine the forward star of node r:
   IF U(j) > U(r) + C(r,j)
   THEN
      U(j) = U(r) + C(r,j)
      P(j) = -r
   END

3. IF P(i) ≥ 0 for all i ∈ N
   THEN proceed to step 4
   ELSE set
      r = argmin { U(i) : P(i) < 0 }
      P(r) = -P(r)
      Repeat step 2
   END


Figure 2.6 Label Setting Improved by Use of Temporary Labels.
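
The temporary-label variant of Figure 2.6 might be sketched in Python as shown below. The dictionary network and the function name are assumptions; node numbers must be positive integers so that the sign of P() can carry the temporary flag, and a simple linear scan stands in for whatever minimum-finding method an implementation would actually use.

```python
import math

def improved_label_setting(graph, root):
    """Label setting with temporary labels flagged by a negative predecessor."""
    label = {n: math.inf for n in graph}
    pred = {n: 0 for n in graph}
    label[root] = 0
    r = root
    while True:
        for j, c in graph[r]:              # each forward star is scanned once
            if label[j] > label[r] + c:
                label[j] = label[r] + c
                pred[j] = -r               # negative sign: label is temporary
        temporary = [n for n in graph if pred[n] < 0]
        if not temporary:                  # no temporary labels remain: stop
            break
        r = min(temporary, key=lambda n: label[n])
        pred[r] = -pred[r]                 # flip the sign: label is permanent
    return label, pred

# Hypothetical network; node 5 cannot be reached from the root.
graph = {1: [(2, 4), (3, 1)], 2: [(4, 5)], 3: [(2, 2)], 4: [], 5: []}
print(improved_label_setting(graph, 1))
```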


Label setting is just one single-source shortest path methodology. Attention is
now turned to a related yet different single-source shortest path algorithm.
4. Label Correcting Algorithms 

Another method of solving single-source shortest path problems is to use a
label correcting algorithm [Ref. 3,4,5,6]. As with the label setting algorithm, label
correcting generates a tree of nodes connected by shortest paths, and associates a label
with each node in the tree. This node label is identical to that used in label setting.

Although not relevant in the present applications, it is worth noting that the
label correcting algorithm can handle negative arc flow costs. There is one restriction,
however: no cycle in the network can have negative total cost.

Label correcting sets temporary labels as it proceeds and, upon reaching a
specific ending condition, declares all labels permanent. To accomplish this, label
correcting uses a list which contains nodes whose labels have been modified (i.e.,
corrected). Initially, this list contains only the root node r.

The list is processed in a last-in first-out (LIFO) fashion. When node i is
stripped off of the list, its forward star is examined. If there exists a node j such that:

U(j) > U(i) + C(i,j),

then the label of node j is corrected by,

U(j) = U(i) + C(i,j)

and its predecessor function is updated:

P(j) = i.

In addition, a node whose label has been corrected is added to the list if not already
appearing on it. The list is processed until there are no nodes left on it. At that time
the algorithm stops and all labels are declared permanent. Figure 2.7 depicts the label
correcting algorithm. [Ref. 3,5]
5. Improving the Basic Label Correcting Algorithm 

There are numerous ways to improve on the basic label correcting algorithm.
Most improvements specify a different manner in which to process the list of corrected
nodes. One such method, termed scan-eligible [Ref. 6: page 67], uses a partitioning of
the corrected nodes into two lists, NOW and NEXT. The nodes on NOW are
processed in a LIFO fashion exactly as had been the basic algorithm's list. However,
corrected nodes are placed onto the NEXT list. Then when all the nodes on NOW
have been processed, NOW is set equal to NEXT, and NEXT is set to an empty set.
This process is repeated until both NOW and NEXT are empty sets. The resulting
labels are permanent at that time and the predecessor function contains the shortest
path tree. Figure 2.8 depicts the label correcting algorithm improved through the use
of scan-eligible lists.


E. MULTIPLE SHORTEST PATHS
The final SOTACA-related problem to be addressed concerns the existence of
multiple shortest paths in a network. When multiple shortest paths exist, it is desired
that the following occur:
• that a shortest path solution be produced, and
• that the multiple paths between a user-specified sink node and the root node be
  enumerated at the user's request. Note that in some networks the number of
  alternate shortest paths between a root and sink node may be very large. In


Input:
(1) A directed graph G=(N,A) with unbounded arc flow costs C(i,j).
(2) A specified root node r.
Output:
(1) The shortest path costs in the label function U().
(2) The shortest path tree in the predecessor function P().

1. Initialize a tree T(N_T, A_T) such that:
   a. N_T = {r}
   b. A_T = { }
   c. U(t) = ∞ for all t ∈ N-N_T
   d. U(r) = 0
   e. P(t) = 0 for all t ∈ N
   f. LIST = {r}

2. IF LIST = { } THEN proceed directly to step 4.
   ELSE define:
      i = node at the top of LIST
      LIST = LIST - {i}
   END

3. Examine the forward star of node i and for each node j ∈ N where:
      U(j) > U(i) + C(i,j)
   Redefine:
      N_T = N_T ∪ {j}
      A_T = ( A_T - {(s,j) ∈ A_T} ) ∪ {(i,j)}
      P(j) = i
      U(j) = U(i) + C(i,j)
      LIST = LIST ∪ {j}
   Repeat step 2.

4. Stop.

[Ref. 3,5]


Figure 2.7 The Label Correcting Algorithm.
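
A short Python sketch of the LIFO label correcting steps in Figure 2.7 is given below. The dictionary network, the function name, and the small example with a negative arc are assumptions made for illustration; as the text requires, the example has no cycle of negative total cost.

```python
import math

def label_correcting(graph, root):
    """Label correcting with the corrected-node list processed last-in first-out."""
    label = {n: math.inf for n in graph}
    pred = {n: 0 for n in graph}
    label[root] = 0
    stack = [root]                         # LIST in the figure
    on_list = {root}                       # avoids adding a node twice
    while stack:
        i = stack.pop()                    # node at the top of LIST
        on_list.discard(i)
        for j, c in graph[i]:
            if label[j] > label[i] + c:    # label of j can be corrected
                label[j] = label[i] + c
                pred[j] = i
                if j not in on_list:
                    stack.append(j)
                    on_list.add(j)
    return label, pred

# Hypothetical network with one negative arc and no negative cycle.
graph = {1: [(2, 5), (3, 2)], 2: [(3, -4)], 3: []}
print(label_correcting(graph, 1))
```

In this example node 3 first receives the label 2 and is later corrected to 1 through node 2, which shows why no label can be treated as permanent until the list is empty.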
Input:
(1) A directed graph G=(N,A) with unbounded arc flow costs C(i,j).
(2) A specified root node r.

Output:
(1) The shortest path costs in the label function.
(2) The shortest path tree in the predecessor function.

1. Initialize a tree T(N_T, A_T) such that:
   a. N_T = {r}
   b. A_T = { }
   c. U(t) = ∞ for all t ∈ N-N_T
   d. U(r) = 0
   e. P(t) = 0 for all t ∈ N
   f. NOW = {r}, NEXT = { }

2. Define:
      i = node at top of NOW
      NOW = NOW - {i}

3. Examine the forward star of i and for each j ∈ N where:
      U(j) > U(i) + C(i,j)
   Redefine:
      N_T = N_T ∪ {j}
      A_T = ( A_T - {(s,j) ∈ A_T} ) ∪ {(i,j)}
      P(j) = i
      U(j) = U(i) + C(i,j)
      NEXT = NEXT ∪ {j}

4. IF NOW ≠ { } THEN repeat step 2.
   IF NOW = { } and NEXT = { } THEN go to step 5.
   Otherwise set:
      NOW = NEXT
      NEXT = { }
      Repeat step 2.

5. Stop.

[Ref. 3,5,6]


Figure 2.8 Improved Label Correcting Using Scan-Eligible Lists.
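
The scan-eligible variant of Figure 2.8 can be sketched in Python as follows. The names and the dictionary network are again assumptions; the only change from a plain LIFO label correcting loop is that corrected nodes wait on NEXT until NOW has been exhausted.

```python
import math

def scan_eligible_label_correcting(graph, root):
    """Label correcting with the corrected nodes split into NOW and NEXT lists."""
    label = {n: math.inf for n in graph}
    pred = {n: 0 for n in graph}
    label[root] = 0
    now, nxt = [root], []
    queued = set()                         # nodes currently waiting on NEXT
    while now:
        i = now.pop()                      # NOW is processed last-in first-out
        for j, c in graph[i]:
            if label[j] > label[i] + c:
                label[j] = label[i] + c
                pred[j] = i
                if j not in queued:        # corrected nodes go onto NEXT
                    nxt.append(j)
                    queued.add(j)
        if not now:                        # NOW exhausted: NEXT becomes NOW
            now, nxt, queued = nxt, [], set()
    return label, pred

# Hypothetical network with one negative arc and no negative cycle.
graph = {1: [(2, 5), (3, 2)], 2: [(3, -4)], 3: []}
print(scan_eligible_label_correcting(graph, 1))
```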
  this case, it may be desirable to enumerate only a subset (size specified by the
  user) of all these alternate paths.


None of the algorithms discussed thus far are designed to produce a solution of
this type. As designed, each algorithm merely provides the first shortest path solution
encountered, ignoring any alternate shortest paths. This section describes some
methods for attaining recognition of multiple paths and enumerating the alternate
paths upon user request.

1. Determining the Existence of Multiple Shortest Paths

The first thing to be done is to find a means by which it can be determined
that multiple shortest paths exist in a network. This is readily accomplished in the
basic label setting algorithm through the setting of a flag to indicate that multiple
shortest paths exist. This flag is set during the examination of arcs in the forward star
of a labeled node. There are three conditions which indicate that multiple shortest
paths exist and these are shown in Figure 2.9. Note that these conditions can be
looked for as the algorithm builds the shortest path tree.


When examining the forward star of node i:
   IF
      ...
      U(j) = U(i) + C(i,j)
      P(j) ≠ i
   THEN
      Set the Multiple Solution Flag
   END


Figure 2.9 Setting the Multiple Optimal Solution Flag.


The setting of a flag can also be done in the other three algorithms. However,
the nature of improved label setting and of both label correcting algorithms requires an
entire re-examination of all the arcs in the network after the shortest path tree has
been built. If the conditions described in Figure 2.9 exist, then the multiple solution
flag is set.
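
The re-examination pass can be sketched in Python as shown below. The test applied here, a tie U(j) = U(i) + C(i,j) on an arc that is not the recorded tree arc into j, is an assumption about the conditions of Figure 2.9; the function name and the example are hypothetical, and integer arc costs are assumed so that the equality test is exact.

```python
import math

def multiple_solution_flag(graph, label, pred):
    """Re-examine every arc after the tree is built and report whether a tie exists."""
    for i in graph:
        if label[i] == math.inf:
            continue                       # unreachable nodes cannot cause ties
        for j, c in graph[i]:
            # Arc (i,j) reaches j at optimal cost but is not the tree arc into j.
            if label[j] == label[i] + c and pred[j] != i:
                return True
    return False

# Hypothetical network with two equal-cost routes from node 1 to node 4.
graph = {1: [(2, 1), (3, 1)], 2: [(4, 1)], 3: [(4, 1)], 4: []}
label = {1: 0, 2: 1, 3: 1, 4: 2}
pred = {1: 0, 2: 1, 3: 1, 4: 2}
print(multiple_solution_flag(graph, label, pred))
```

The pass costs one look at each of the |A| arcs, on top of the shortest path computation itself.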


2. Enumerating the Multiple Shortest Paths 

There are (at least) two methods by which multiple shortest paths from the
root to a specified sink node can be identified and presented to the user: depth-first
search [Ref. 7: page 91], and breadth-first search [Ref. 7: page 95]. For this thesis, the
depth-first search technique is used.

The depth-first search is an optimistic approach which considers one path as
good as any other during the search. After the designation of the root and sink node,
a depth-first search starts at the root node and builds a tree as it searches to reach the
sink node. Essentially, it adds nodes in a sequential fashion, building a tree consisting
of a main trunk with no branches. Thus, the search proceeds deeper and deeper (i.e.,
further away from the root node) until no more nodes can be added or when there is
no hope of reaching the sink. At that point, the search backs up the tree one level and
examines other alternatives that have not been searched at that level. If alternatives
exist, the search goes back to its headlong dash down the new trunk. If no alternatives
exist, the search will back up another level and check for alternatives there. The search
stops when the sink node is found or when the backing up reaches the root node with
no alternatives left unsearched.

Note that having a shortest path solution before enumerating the alternate
shortest paths provides the known optimal distance to reach the sink, s. Thus, the
depth-first search can be interrupted and directed to other paths once the current trunk
length exceeds U(s), as there is no hope of reaching s optimally at that point. Figure
2.10 provides a description of the depth-first search algorithm.


F. TESTING AND EVALUATION 

All six algorithms described in Sections D and E (i.e., label setting, improved label
setting, label correcting, improved label correcting, label setting modified for multiple
solutions, and depth-first search) were successfully implemented in FORTRAN for
execution on the Naval Postgraduate School's IBM 3033.

Testing and evaluation consisted of three specific stages. The first test was a
minor one which simply verified the functioning and output of the single-source
shortest path algorithms. The second test provided for a comparison of execution
times for each of these algorithms against sample networks. The third test involved
verifying the function and output of the label setting algorithm modified for multiple
solutions and of the depth-first search algorithm.


Input:
(1) The root node r.
(2) The sink node s.
(3) MAXDEPTH.
Output:
(1) The shortest paths from r to s, or an indication that r and s are not
    connected via alternate paths.

1. LIST = {r}
2. IF LIST = { } THEN proceed directly to step 5.
3. Processing LIST in a LIFO fashion, define:
   a. i = node at the top of LIST
   b. IF U(i) > MAXDEPTH THEN
         Remove i from LIST
         Repeat step 2
   c. IF i = s THEN
         Announce success
         Proceed directly to step 4
   d. Scan the forward star of node i for a successor node v:
      IF there are no eligible successors THEN
         Remove i from LIST
         Repeat step 2
      ELSE
         Add v to LIST
         Designate the associated arc as examined
         Set i = v
         Repeat step 2
      END
4. IF success has been announced THEN
      Record the path from r to s
      Remove s from the top of LIST
      Repeat step 2
   ELSE
      Proceed to step 5
   END
5. Stop.
[Ref. 7]


Figure 2.10 Depth-first Search Algorithm.
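
The enumeration can be illustrated by the Python sketch below. It is not a transcription of Figure 2.10: recursion takes the place of the explicit LIFO list, the pruning test compares the running trunk length with the known optimal cost as described in the text, and the names, the dictionary network, and the optional limit parameter (standing in for a user-specified subset size) are assumptions.

```python
def enumerate_shortest_paths(graph, root, sink, max_depth, limit=None):
    """Depth-first enumeration of root-to-sink paths whose cost is at most max_depth."""
    found = []
    trunk = [root]                         # current path from the root

    def extend(node, cost):
        if limit is not None and len(found) >= limit:
            return                         # the requested number of paths is in hand
        if cost > max_depth:
            return                         # no hope of reaching the sink optimally
        if node == sink:
            found.append(list(trunk))      # record the path from root to sink
            return
        for succ, c in graph[node]:
            if succ in trunk:
                continue                   # never revisit a node on the trunk
            trunk.append(succ)
            extend(succ, cost + c)
            trunk.pop()                    # back up one level

    extend(root, 0)
    return found

# Hypothetical network; the optimal cost from node 1 to node 4 is 2.
graph = {1: [(2, 1), (3, 1), (4, 5)], 2: [(4, 1)], 3: [(4, 1)], 4: []}
print(enumerate_shortest_paths(graph, 1, 4, max_depth=2))
```

When max_depth is the label U(s) from a prior shortest path solution, every recorded path is an alternate shortest path, since no path to the sink can cost less than U(s).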
The sample networks used to test the algorithms were generated by the random
network generator NETGEN [Ref. 8] on the IBM 3033. Each network consisted of a
set of nodes and a set of directed arcs with their associated arc flow costs.

1. Label Setting and Label Correcting 

a. Test for Algorithm Functioning 
This test was designed to determine if the algorithms were properly
implemented by verifying the contents of the predecessor and label functions. To
accomplish this, NETGEN was used to generate small networks of 10 nodes and 30
arcs. Each algorithm was run using these networks and the resulting predecessor and
label functions output for root nodes 1 through 10. The results were compared to the
known results to verify the output accuracy. This test demonstrated that the
algorithms functioned properly, and produced accurate predecessor and label functions.
b. Test for Algorithm Comparison
This test was designed to provide data in the form of algorithm execution
times for a set of sample network problems. The times were compared to give an
indication of the relative speed of each algorithm.
The test was designed as follows:

• The algorithms were standardized so that execution times measured the same
  set of tasks in each algorithm.

• NETGEN was used to generate 3 networks of the following sizes:
     Network 1: 300 nodes and 1230 arcs
     Network 2: 200 nodes and ... arcs
     Network 3: 100 nodes and 1000 arcs

• Each algorithm was run using the same set of 10 root nodes (one run per root
  node), with the execution times recorded internally (using FORTRAN
  GETIME and SETIME functions). The root nodes were generated using the
  FORTRAN pseudo-random number generator LRND.

The results of the test were summarized by noting the minimum, maximum, and mean
execution times. These results are presented in Table 1 and provide a general
indication of the speed of each algorithm.

The algorithms use a variety of variables and arrays to support the
production of shortest paths. However, the bulk of the data structure in each
algorithm is dedicated to two functions:

• storing network data
• storing shortest path solutions

Disregarding the non-array variables, the approximate size of the data structure for
each algorithm, as well as that for Floyd's algorithm, is presented in Table 2.


TABLE 1

EXECUTION TIMES (IN CPU SECONDS)

                                       Problem Size
                            100 nodes    200 nodes    300 nodes
Algorithm                   1000 arcs    ... arcs     1230 arcs

Label Setting
   Average
   Minimum
   Maximum

Improved Label Setting
   Average                    0.010        0.066        0.150
   Minimum                    0.003        0.059        0.143
   Maximum                    0.026        0.083        0.183

Label Correcting
   Average
   Minimum
   Maximum

Improved Label Correcting
   Average
   Minimum
   Maximum





2. Depth-First Search For Multiple Paths
This test was primarily concerned with ensuring that the multiple solution flag
was set properly in the modified label setting algorithm and that the depth-first search
algorithm produced the correct alternate paths for a given root and sink node.
The test was designed as follows:

• 2 networks of 10 nodes and 20 arcs were constructed. Network 1 was prepared
such that only one shortest path existed between each pair of nodes.
Conversely, network 2 was designed to have multiple shortest paths between
some nodes. The shortest path solutions for both networks were known.


TABLE 2
DATA STRUCTURE SIZE

                          Label          Label            Floyd's
Function                  Setting        Correcting*      Algorithm

Storing Network Data      2|A| + |N|     2|A| + |N|       3|A|
Shortest Path Generation                 |N|
Storing Shortest Path
Solutions                 2|N|           2|N|             2|N||N|

Total                     2|A| + 3|N|    2|A| + 4|N|      3|A| + 2|N||N|

* The scan-eligible label correcting algorithm uses an additional (i.e., two
total) |N|-length list for shortest path generation.





• The modified label setting algorithm was run with each node in the network
specified as the root node for one run. The depth-first search algorithm was
run after a shortest path solution had been produced if the multiple solution
flag was set.

• The output of the depth-first search algorithm was compared to the known
alternate shortest paths in the network. This comparison focused on the accuracy
of each alternate path produced, as well as the completeness of the solution
(with regard to the known quantity of alternate paths to be found).

Both the modification for multiple solutions and the depth-first search
algorithm functioned properly. The multiple solution flag was set appropriately, and
the depth-first search algorithm then produced the correct alternate paths.
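
To illustrate what such a test checks, the sketch below enumerates every shortest path between a root and a sink by depth-first search. It is a hedged Python illustration and not the FORTRAN routine that was tested: instead of a multiple solution flag it assumes the labels from a single-source solution are available in a dictionary `dist`, and it follows only "tight" arcs (those satisfying dist[i] + cost = dist[j]). The name `all_shortest_paths` and the dictionary-based network format are assumptions.

```python
def all_shortest_paths(arcs, dist, root, sink):
    """Enumerate every shortest path from root to sink by depth-first search.

    arcs: dict mapping (i, j) -> non-negative arc cost
    dist: dict of shortest path labels measured from `root`
    """
    # Keep only arcs that can lie on a shortest path from the root.
    tight = {}
    for (i, j), cost in arcs.items():
        if i in dist and j in dist and dist[i] + cost == dist[j]:
            tight.setdefault(i, []).append(j)

    paths = []

    def dfs(node, path):
        if node == sink:
            paths.append(list(path))
            return
        for nxt in tight.get(node, []):
            if nxt not in path:              # guards against zero-cost cycles
                path.append(nxt)
                dfs(nxt, path)
                path.pop()

    dfs(root, [root])
    return paths
```

On a network built to contain exactly one shortest path per node pair, the function returns a single path; on a network with known alternates, the returned list can be compared against the known set for both accuracy and completeness.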


III. REFERENCE NODE AGGREGATION


A. INTRODUCTION 

The second part of this thesis involves questions which arose in connection with
work by Professor Rosenthal on a vehicle routing algorithm. While not directly related
to the SOTACA problem, the foundations are much the same. One main difference is
that the networks of interest are large scale.

Despite this vast increase in problem size over SOTACA networks, the item of
interest is the same, namely the shortest paths between nodes.


B. PROBLEM DEFINITION 
For consistency purposes, the same basic network terminology used in the 
SOTACA chapter will be used to formally describe the problem at hand. 
1. Assumptions and Given Data 
Let G=(N,A,R) be a large-scale directed graph of |N| nodes and |A| arcs.
The size of G is not fixed, but in practice it is expected that |N| is large (e.g., 500,000 or
more nodes), while |A| is approximately bounded as follows:
1.75|N| ≤ |A| ≤ 3|N|


Each arc of A has a non-negative flow cost, C(i,j).

The graph G forms the basis upon which the vehicle routing model performs
its function of routing vehicles from one location (node i) to another (node j) at
minimum cost. With no restrictions placed upon i or j, all nodes in N are potential
origins and destinations for the model. However, it is recognized that only a very small
fraction of all the shortest paths in G will ever be used. So, rather than wasting
considerable time to produce an all-pairs shortest path solution which cannot be stored
with available computing machinery anyway, it is desired that the shortest path
algorithm produce a part of the all-pairs solution (i.e., one small enough to be stored
internally) in which some shortest paths are known and from which all others can be
quickly approximated. With current technology, it is assumed that several |N|-length
arrays can be stored internally but not |N| of them.

To this end, a set of nodes is designated as a reference set. This set, R
(r = |R|), is a subset of N and in practice it is expected that:

r << |N| (e.g., r = .0001|N|).

This designation is a result of a partitioning of all the nodes of N into r clusters where
each cluster contains one reference node. All of the remaining nodes in a cluster are
considered ordinary nodes.

This designation of R is to be used by the shortest path algorithm to produce
a subset of the all-pairs solution, namely the all-pairs of R solution. From this all-pairs
of R solution, the vehicle routing model must be able to determine the shortest
path between any nodes in the network. Thus, the all-pairs of R solution must be
structured such that all nodes in the network are a known distance from at least one of
the reference nodes. To facilitate this, the shortest path algorithm shall utilize an
engineering parameter approach which provides the user a degree of control over the
amount of approximation used.

The engineering parameter approach is defined as follows. Letting SP_ij
represent the optimal shortest path cost from reference root node i to node j, and EP1,
EP2, and EP3 the engineering parameters, with EP1 < EP2 < EP3, the proposed rules
are as follows:

• if SP_ij ≤ EP1, then the algorithm is required to produce the shortest path
accurately

• if SP_ij is known and satisfies EP1 < SP_ij < EP2, then the algorithm is allowed
to approximate SP_ji by SP_ij

• if SP_ij ≥ EP2, then the algorithm may neglect computing the shortest path,
approximate SP_ij by EP3, and force the algorithm to a halt (i.e., stop
computing shortest paths)
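
The three rules can be sketched as a small classification function. This is an illustrative Python sketch only; the function name `apply_rules` and the rule labels "exact", "symmetric", and "halt" are assumptions introduced here, not names from the FORTRAN implementation.

```python
def apply_rules(sp, ep1, ep2, ep3):
    """Classify a shortest path cost `sp` under the engineering parameters.

    Returns (rule, label), where `label` is the value recorded for the node.
    """
    if not ep1 < ep2 < ep3:
        raise ValueError("engineering parameters must satisfy EP1 < EP2 < EP3")
    if sp <= ep1:
        return "exact", sp          # rule 1: path must be computed accurately
    if sp < ep2:
        return "symmetric", sp      # rule 2: SP_ji may be approximated by SP_ij
    return "halt", ep3              # rule 3: label with dummy distance EP3 and stop
```

With EP1=50, EP2=60, and EP3=100 (one of the parameter settings used in the tests later in this chapter), a path of cost 55 falls under the symmetry rule, while a path of cost 75 is labeled 100 and ends the current run.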


The first rule requires that all shortest paths of length EP1 or less be
computed accurately. That is, the optimal shortest path must be located and the node
labeled appropriately if the node is to be labeled at all. In essence, this provides the
means by which the user can ensure that the shortest path from a reference node to
each of the ordinary nodes in the same cluster is accurately computed. Applied to each
cluster, this ensures that all nodes in N are a known distance from at least one of the
reference nodes.

The second rule essentially provides for assumed symmetry between reference
root node i and node j. With respect to the vehicle routing problem, this rule is
designed to allow the algorithm to ignore the asymmetry of a particular route on trips
of specified length. For example, on trips between location i in city 1 and location j in
city 2, the one-way on-ramp to an interstate can be ignored as its distance is
insignificant to the total shortest path distance between i and j. Thus, to save work and
execution time, SP_ji is approximated by SP_ij. In contrast, this rule ensures that
symmetry is not assumed when the path is shorter than EP1. Consider the case where
node i and node j are different locations in the same business district in a city. To
ignore one-way streets in this setting may produce a gross inaccuracy in the computed
SP_ij. Thus, the selection of EP1 and EP2 enables the user to determine under what
conditions symmetry may be assumed.

The third rule defines the maximum distance from the reference root node(s)
that the user wants examined. When the algorithm encounters the first shortest path
length greater than EP2, the associated node is labeled with a dummy distance (EP3)
and the algorithm stops. All nodes, to include the non-root reference nodes, that are
outside this maximum range are not labeled. Leaving these nodes unlabeled is
acceptable since in vehicle routing these nodes will never be visited consecutively and
thus it is not required to know SP_ij for them.

These rules essentially allow the user to adjust the scope of the shortest path
problem according to individual desires or needs, as well as providing the flexibility to
take advantage of advances in computer hardware as improvements are introduced.

2. SP_ij Approximations

The shortest path solutions produced are for all-pairs of R. That is, each
node in R is designated as the root node once, and this results in r single-source
shortest path solutions. From these r solutions, any SP_ij for G can be computed in
one step. The user designates the i and j of interest, and the model knows the reference
node that each is associated with. Letting I represent the reference node associated
with node i, and J represent the reference node associated with node j, the
computation of any SP_ij is as follows:


[equation not legible in source]


where p and q are weights designated by the user for adjusting this approximation. 
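
Because the exact form of the approximation cannot be read from this copy, the following Python sketch shows only one plausible reading, stated here as an assumption: the path from i to j is routed through the two reference nodes, with the user weights p and q applied to the two end legs, giving p·SP_Ii + SP_IJ + q·SP_Jj. The leg from i to I is taken from the solution rooted at I, which relies on the symmetry assumption of the second rule. The names `approximate_sp`, `ref_of`, and `label` are hypothetical.

```python
def approximate_sp(i, j, ref_of, label, p=1.0, q=1.0):
    """Approximate SP_ij in one step from the all-pairs of R solution.

    ref_of: dict mapping each node to its reference node
    label:  dict of dicts, label[root][node] = shortest path cost from root
    p, q:   user weights on the two end legs (assumed placement)
    """
    ref_i, ref_j = ref_of[i], ref_of[j]
    leg_in = label[ref_i][i]         # SP_Ii, used for the leg i -> I by symmetry
    trunk = label[ref_i][ref_j]      # SP_IJ, taken from the all-pairs of R solution
    leg_out = label[ref_j][j]        # SP_Jj
    return p * leg_in + trunk + q * leg_out
```

Whatever the exact weighting, the computation touches only three stored labels, which is what allows the vehicle routing model to obtain any SP_ij in one step.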
3. The Problem 
The problem to be addressed is two-fold. The first task is to determine what
type of shortest path algorithm is most appropriate to this situation and will function
as the base for construction of the reference node aggregation algorithm. The second
is to design and implement an algorithm which reflects the engineering parameter
approach and produces an all-pairs of R shortest path solution from which all SP_ij can
be approximated efficiently. The goal is to produce the all-pairs of R solution quickly,
so that the vehicle routing model can compute any SP_ij from it in one step.


C. THE PROPOSED ALGORITHM
1. Base Algorithm Selection 

A straightforward approach to solving this problem efficiently is to choose an
algorithm which can produce shortest paths without necessarily examining every arc in
A and each node in N. The ultimate algorithm would examine only those nodes and
arcs involved in the shortest paths for R.

In this pursuit, it was decided to use a label setting algorithm as the base upon
which to build. The most attractive aspect of a label setting method is that at each
iteration, the permanent labels are optimal. That is, the shortest paths computed from
the specified root are part of the final shortest path tree T=(N_T, A_T), even though T is
not complete until the |N|-1 iteration. Label correcting, on the other hand, is not
necessarily optimal until its last iteration. Likewise, an all-pairs algorithm (like Floyd's)
is not optimal until all paths in the network have been examined. By exploiting this
optimality feature of label setting, it is hoped that an efficient, yet effective, algorithm
can be developed.

The reader will recall that in Chapter II, two label setting algorithms were
discussed. The first was the basic label setting algorithm, while the second, improved
label setting, used temporary labels which enabled a one-time examination of each arc
vice repetitive examinations. Both of these label setting techniques will be used as a
base for the reference node aggregation algorithm design, and testing will provide for a
comparison between them.

2. Termination Measures

A means of exploiting the label setting algorithm for the problem at hand
involves constructing the capability to force the termination of the algorithm before
normal completion at the |N|-1st iteration. The shortest path solution for a given
reference root node is complete no later than the point where all non-root reference
nodes are labeled, and thus when this occurs the algorithm can be stopped. This
premature termination is acceptable because the shortest paths are optimal
at each iteration and because those shortest paths not identified by the time all reference
nodes are labeled have no direct impact on the all-pairs of R shortest path solutions.
At worst, an alternate shortest path may be ignored.

In essence, this involves adding one step to the base label setting algorithm
which checks to see if the label of each reference node is permanently set. If they are,
the algorithm is stopped. On the other hand, if there is even one reference node not
labeled, the algorithm proceeds as normal.
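
The termination check can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation described in this chapter: it uses a binary heap (`heapq`) to find the best temporary label, whereas the algorithms here scan the labels directly, and the adjacency-list format and the name `label_setting_until_refs` are hypothetical.

```python
import heapq


def label_setting_until_refs(adj, root, ref_nodes):
    """Label setting that stops once every non-root reference node is permanent.

    adj: dict mapping node -> list of (neighbor, non-negative cost)
    Returns (labels, pred) restricted to the permanently labeled nodes.
    """
    dist = {root: 0}
    pred = {root: None}
    permanent = set()
    remaining = set(ref_nodes) - {root}      # reference nodes still unlabeled
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in permanent:
            continue                          # stale heap entry
        permanent.add(u)                      # the label of u is now optimal
        remaining.discard(u)
        if not remaining:
            break                             # all reference nodes labeled: stop early
        for v, cost in adj.get(u, []):
            nd = d + cost
            if v not in permanent and nd < dist.get(v, float("inf")):
                dist[v] = nd
                pred[v] = u
                heapq.heappush(heap, (nd, v))
    labels = {n: dist[n] for n in permanent}
    return labels, {n: pred[n] for n in permanent}
```

Only permanent labels are returned, since temporary labels at the moment of termination are not guaranteed to be optimal.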

The termination measure is designed to stop the algorithm from doing work
that does not directly contribute to producing the desired shortest paths for the nodes
in R. To assist in this effort, attention is now turned to another efficiency measure.

3. Avoiding Recomputation of Shortest Paths 

Another measure to be added to the base algorithm also takes advantage of 
the nature of the label setting technique. As was mentioned earlier, solving a network 
for an all-pairs of R shortest path solution requires that the algorithm be run once for 
each reference node and that each reference node be designated as the root node for a 
specific run. 

With the exception of the first reference root node, this algorithm repetition
can be exploited in that some shortest paths do not have to be computed again. Each
repetition of the algorithm locates and labels the non-root reference nodes along some
shortest path. Should a non-root reference node have any successors in a particular
solution, these successors can be immediately labeled at the beginning of the iteration
where the node is designated as the reference root node. This is possible, once again, due
to the fact that the label setting technique produces shortest paths at each iteration.
Thus, any successor (node) to node i on a shortest path where i is not the root node is
also a successor on that same path when node i is the root node.

So at each repetition of the algorithm, the previous shortest path solutions
can be examined to determine if the new reference root node had successors in these
solutions. When this occurs, the successors can be immediately labeled, thereby
eliminating the computation of those shortest paths for the current iteration.

4. Summary 

Two versions of the reference node aggregation algorithm are proposed. The
first version utilizes the basic label setting algorithm (previously depicted in Figure 2.5
of Chapter II) as an underlying structure and blends in the engineering parameter
approach, as well as the termination measures and the measures for avoiding the
recomputation of shortest paths.

Likewise, the second version blends in the parameter approach and these same
measures, but differs in that it utilizes the improved label setting algorithm (previously
depicted in Figure 2.6 of Chapter II) as the underlying structure.


D. THE DESIGNED ALGORITHM
1. Data Structure 

Supporting both versions of the proposed reference node aggregation
algorithm is a data structure which is similar to that discussed in Chapter II. The
network is represented internally by the TAIL array, the flow cost array C( ), and the
head array H( ) (i.e., TO-NODE). As for the shortest path solutions, they are stored in
the label and predecessor functions discussed previously. However, both functions are
now defined as matrices of dimension |R| by |N|, with each row containing the shortest
path solution associated with the reference node used as the root node to generate that
solution.
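
As a reminder of how a forward star representation of this kind is laid out, the sketch below builds the three arrays from an arc list. It is a hedged Python illustration, not the FORTRAN data structure itself: the names `point`, `head`, and `cost`, the 1-based node numbering, and the extra pointer entry used as an end marker are assumptions made for the example.

```python
def build_forward_star(num_nodes, arcs):
    """Build forward star arrays from a list of (tail, head, cost) arcs.

    Nodes are numbered 1..num_nodes. The arcs leaving node i occupy
    positions point[i] .. point[i + 1] - 1 of the head and cost arrays.
    """
    ordered = sorted(arcs, key=lambda a: a[0])    # group arcs by tail node
    head = [h for _, h, _ in ordered]
    cost = [c for _, _, c in ordered]
    point = [0] * (num_nodes + 2)
    for t, _, _ in ordered:
        point[t + 1] += 1                         # count arcs leaving each tail
    for i in range(1, num_nodes + 2):
        point[i] += point[i - 1]                  # prefix sums give start offsets
    return point, head, cost


def forward_star(point, head, cost, i):
    """Return the (head, cost) pairs of all arcs leaving node i."""
    return [(head[a], cost[a]) for a in range(point[i], point[i + 1])]
```

The pointer array has roughly |N| entries and the head and cost arrays have |A| entries each, which is consistent with the 2|A| + |N| network storage figure used in the data structure tables of this thesis.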

To round out this data structure, three additional arrays are introduced.

a. Reference Node Array 

To identify which nodes in the network are designated as reference nodes,
an array of length |R| is defined. This array, RF( ), simply contains as its elements the
node number of each node belonging to R, that is, those nodes that have been
designated reference nodes.

b. Traversal and Depth Functions 

The final two arrays added to the data structure are of length |N|+1 and 
enable the identification of successors to non-root reference nodes in previous 
solutions. These well-known arrays are the traversal and depth functions [Ref. 2: page 
IS]. 

The traversal function IT( ) provides a means to keep track of the dynastic
ordering of nodes in the shortest path tree T. This dynastic ordering produces a ring of
nodes starting at the root, with each entry in IT( ) pointing to the next node in
succession until the final value points back to the root.

The depth function DP( ) keeps track of how many levels below the root
node a non-root node is found in T. The root node is assigned a depth of zero. Those
nodes directly attached to the root are assigned a depth of one, and this level increases
the farther a node gets from the root. In essence, the depth of a node indicates the
number of nodes (including the root) visited prior to reaching that non-root node along
the shortest path.


For the reference node aggregation algorithm, depth and traversal are used
in conjunction with each other to identify all the successors of a node. IT(i), the
traversal value for node i, points to the next node in the dynastic ordering, and the
depth of that next node specifies whether the node is a successor to node i or merely of
lower order as compared to node i. Iterating this, all the successors of node i can be
identified, as well as the shortest paths the successors are found on. These paths, thus
identified, can be used in the current solution without the necessity of computing from
scratch.
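
The way traversal and depth combine can be sketched in a few lines. This Python illustration is an assumption-laden sketch rather than the FORTRAN code: it stores IT( ) and DP( ) as dictionaries named `thread` and `depth`, and it assumes a sentinel entry with depth -1 that closes the ring, so the scan always stops.

```python
def successors(r, thread, depth):
    """Return every successor of node r in the shortest path tree.

    thread[i]: next node after i in the dynastic (preorder) ordering
    depth[i]:  number of arcs between the root and node i
    """
    found = []
    s = thread[r]
    while depth[s] > depth[r]:       # deeper than r means still inside r's subtree
        found.append(s)
        s = thread[s]
    return found


def relabel_from_previous(r, thread, depth, label):
    """Labels of r's successors when r becomes the root: U(s) - U(r)."""
    return {s: label[s] - label[r] for s in successors(r, thread, depth)}
```

For a tree rooted at node 1 with children 2 and 5, where node 2 has children 3 and 4, the successors of node 2 are nodes 3 and 4; the scan stops at node 5 because its depth is not greater than that of node 2.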

2. The Implemented Versions of the Proposed Algorithm 

Computer implementation of the proposed algorithm was accomplished in
FORTRAN. As was indicated earlier, both the basic and improved label setting
techniques were used as base algorithms.

Figure 3.1 depicts the reference node aggregation algorithm utilizing the
improved label setting as its base. In this algorithm, a shortest path solution for each
reference node is generated. Thus, step 1 initializes the solution index and the
associated reference root node is chosen in step 2. The predecessor, label, depth, and
traversal functions are initialized for the current solution in step 3. Step 4 has two
parts, and in part 4a, any SP_ij found in previous solutions (i.e., associated with another
reference node) that has a length between EP1 and EP2 is used to approximate SP_rj,
where r is the current reference root node. Step 4b sets the label and predecessor of all
nodes which were successors of the current reference root node in the previous
solutions. In step 5, the reference nodes are examined at each opportunity where it is
possible that each has been permanently labeled; upon finding this to be true, the step
forces the current iteration to halt and a new iteration (with a new reference root node)
to start. The forward star of the last labeled node is examined in step 6 and all
temporary labels which can be improved upon are updated. The best temporary label
is located in step 7, while step 8 determines if the shortest path solution has reached
the furthest distance (EP2) from the root node that the user wants examined. If this
distance is met or exceeded, the current iteration is halted, and a new reference root
node is designated. In step 9, the best temporary label is set permanent. Step 10
increments the iteration index in preparation for selecting the new reference root node.
And finally, in step 11 the all-pairs of R solution is complete and the algorithm stops.

Figure 3.2 depicts the reference node aggregation algorithm using basic label
setting as its base. In this version, a shortest path solution for each reference node is


Input:
(1) RF( ), the array of reference nodes.
(2) Network data in the form of the arrays C( ), H( ), and T( ).
(3) EP1, EP2, EP3
Output:
(1) P^I( ) for I = 1, |R|
(2) U^I( ) for I = 1, |R|

1. Initialize the repetition index: I = 1

2. Initialize the tree T with the reference root node: r = RF(I)

3. Iteration initialization:
     P^I(j) = 0, for all j ∈ N
     U^I(j) = ∞, for all j ∈ N
     U^I(r) = 0
     DP^I(j) = 0, for all j ∈ N
     IT^I(j) = 0, for all j ∈ N
     DP^I(|N|+1) = -1
     IT^I(|N|+1) = r
     IT^I(r) = |N|+1
     RFCNT = |R|

4. FOR K = 1, I-1 DO
   a. Examine the backpath of r. Let j = P^K(r)
      WHILE j ≠ 0 DO
        IF EP1 < U^K(r) - U^K(j) < EP2 THEN
          U^I(j) = U^K(r) - U^K(j)
          P^I(j) = K + |N|  (denotes that the path from r to j is found
                             in the shortest path solution for RF(K))
          j = P^K(j)
        ENDIF
      END WHILE
   b. Examine the successors of r. Let s = IT^K(r).
      WHILE DP^K(r) < DP^K(s) DO
        U^I(s) = U^K(s) - U^K(r)
        P^I(s) = P^K(s)
        s = IT^K(s)
      END WHILE
   END DO

5. IF P^I(j) > 0 for all j ∈ R, j ≠ RF(I) THEN go directly to step 10
   ELSE set RFCNT = number of unlabeled reference nodes

6. Examine the forward star of node r. For each arc from r to a node j
   where U^I(r) + C(r,j) < U^I(j), set:
     U^I(j) = U^I(r) + C(r,j)
     P^I(j) = -r

7. DSMALL = ∞
   DO j = 1, |N|
     IF P^I(j) < 0 and U^I(j) < DSMALL THEN
       DSMALL = U^I(j)
       k = j
     ENDIF
   END DO

8. IF U^I(k) > EP2 THEN for each j ∈ N where P^I(j) < 0 set:
     P^I(j) = -P^I(j)
     U^I(j) = EP3
     go directly to step 10
   ENDIF

9. Set P^I(k) = -P^I(k)
   RFCNT = RFCNT - 1
   IF RFCNT = 0 THEN goto step 5
   ELSE goto step 6

10. I = I + 1
    IF I ≤ |R| THEN repeat step 2

11. Quit.

Figure 3.1 Reference Node Aggregation Algorithm
Using Improved Label Setting.


also generated. Steps 1 through 5 are identical to those for the reference aggregation
algorithm using improved label setting. In step 6, the forward star of all labeled nodes
is examined, identifying those unlabeled nodes that are candidates to be labeled, while
step 7 locates and labels the best candidate. Step 8 determines if the shortest path
solution has reached the furthest distance (EP2) from the root node that the user wants
examined. If this distance is met or exceeded, the current iteration is halted, and a new
reference root node is designated. Step 9 determines if it is time to check to see if all
the reference nodes have been labeled. The iteration index is incremented in step 10 in
preparation for selecting the new reference root node. And finally, in step 11 the
all-pairs of R solution is complete and the algorithm stops.


E. EXPERIMENTAL DESIGN AND TEST RESULTS
1. Purpose 
Testing was designed to determine if the implemented algorithm worked 
properly and to enable comparison of the reference aggregation algorithm and the base 
label setting algorithms. Additionally, a brief examination of the data structure was 
conducted to determine if the data structure size met the stated problem restrictions. 
2. Design 
The algorithms were run on a set of sample problems generating execution
times. A network of 1000 nodes was selected as the test case size with the number of
arcs bounded as stated in the assumptions. The test was designed as follows:

• NETGEN was used to generate four separate networks of the following
dimensions:

Network A: 1000 nodes and 1750 arcs
Network B: 1000 nodes and 1730 arcs
Network C: 1000 nodes and 3000 arcs
Network D: 1000 nodes and 3000 arcs

• The LRND function of FORTRAN was used to generate four sets of four
reference nodes.




Input:
(1) RF( ), the array of reference nodes.
(2) Network data in the form of the arrays C( ), H( ), and T( ).
(3) EP1, EP2, EP3
Output:
(1) P^I( ) for I = 1, |R|
(2) U^I( ) for I = 1, |R|

1. Initialize the repetition index: I = 1

2. Initialize the tree T with the reference root node: r = RF(I)

3. Iteration initialization:
     P^I(j) = 0, for all j ∈ N
     U^I(j) = ∞, for all j ∈ N
     U^I(r) = 0
     DP^I(j) = 0, for all j ∈ N
     IT^I(j) = 0, for all j ∈ N
     DP^I(|N|+1) = -1
     IT^I(|N|+1) = r
     IT^I(r) = |N|+1
     RFCNT = |R|

4. FOR K = 1, I-1 DO
   a. Examine the backpath of r. Let j = P^K(r)
      WHILE j ≠ 0 DO
        IF EP1 < U^K(r) - U^K(j) < EP2 THEN
          U^I(j) = U^K(r) - U^K(j)
          P^I(j) = K + |N|  (denotes that the path from r to j is found
                             in the shortest path solution for RF(K))
          j = P^K(j)
        ENDIF
      END WHILE
   b. Examine the successors of r. Let s = IT^K(r).
      WHILE DP^K(r) < DP^K(s) DO
        U^I(s) = U^K(s) - U^K(r)
        P^I(s) = P^K(s)
        s = IT^K(s)
      END WHILE
   END DO

5. IF P^I(j) > 0 for all j ∈ R, j ≠ RF(I)
   THEN go directly to step 10
   ELSE set RFCNT = number of unlabeled reference nodes

6. Examine the forward star of all the labeled nodes and define:
     S = {(i,j) | i ∈ N_T, j ∈ N - N_T, (i,j) ∈ A}
   IF S = ∅ THEN proceed directly to step 9.

7. Examine each element of S and:
   a. Find (k,l) = argmin {U^I(i) + C(i,j) : (i,j) ∈ S}
   b. Set:
        P^I(l) = k
        U^I(l) = U^I(k) + C(k,l)

8. IF U^I(l) > EP2 THEN
     U^I(l) = EP3
     go directly to step 10
   ENDIF

9. RFCNT = RFCNT - 1
   IF RFCNT = 0 THEN goto step 5
   ELSE goto step 6

10. I = I + 1
    IF I ≤ |R| THEN repeat step 2

11. Quit.

Figure 3.2 Reference Node Aggregation Algorithm
Using Basic Label Setting.

• Each algorithm was run against the four networks for each of the four sets of
reference nodes. Table 3 identifies the test format.

• The time each algorithm took to compute the shortest path solutions for each
set of reference root nodes was recorded.

• The execution results provide a complete block design which was used in the
non-parametric Friedman Test [Ref. 9: page 299], which examines the hypothesis
that mean execution times for the various algorithms are identical.


TABLE 3
TEST FORMAT

Column headings: Sample Network (A, B, C, D); Eng. Parameters (EP1, EP2, EP3)
[table body not legible in source; one entry reads "(Exact Solution Required)"]

Algorithm used:
1 Basic Label Setting Algorithm
2 Improved Label Setting Algorithm
3 Reference Aggregation Algorithm using Basic Label Setting
4 Reference Aggregation Algorithm using Improved Label Setting


3. Data Structure Size 
The implemented design for the sample problems meets the memory
assumptions of the problem statement. Table 4 provides a summary of the data
structure size for the reference node aggregation algorithm. It should be noted that the
implemented reference node algorithm retains previous shortest path solutions in main
memory (for speed).

TABLE 4
DATA STRUCTURE REQUIREMENTS

Function                  Label Setting Base    Reference Aggregation

Storing Network Data      2|A| + |N|            2|A| + |N|
Shortest Path Generation                        |R| + 2|N|
Storing Shortest Path
Solutions                 2|N|                  2|R||N|

Total                     2|A| + 3|N|           2|A| + 3|N| + 2|R||N| + |R|
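
As a worked illustration of the totals in Table 4, the short Python sketch below evaluates both expressions for a network the size of the test cases. The function names are assumptions introduced for the example; the formulas are the two totals from the table.

```python
def label_setting_storage(n, a):
    """Array elements needed by the label setting base: 2|A| + 3|N|."""
    return 2 * a + 3 * n


def reference_aggregation_storage(n, a, r):
    """Array elements needed by reference aggregation: 2|A| + 3|N| + 2|R||N| + |R|."""
    return 2 * a + 3 * n + 2 * r * n + r
```

For |N| = 1000, |A| = 3000, and |R| = 4, the label setting base needs 9,000 array elements and the reference node aggregation algorithm needs 17,004. The additional storage grows with |R||N| rather than with |N| squared, which is what keeps the structure within the stated memory assumption when |R| is much smaller than |N|.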


4. Algorithm Execution Time Comparison 

Four tests were conducted to enable a comparison of algorithm execution
times for the sample networks. Test 1 was designed to provide sample execution times
for the base algorithms, namely basic and improved label setting. Tests 2, 3, and 4
were designed to provide sample execution times for both versions of the reference
node algorithm using differing values of the engineering parameters. Test 2 uses
engineering parameters that, for the particular networks involved, can be considered
infinite values since no path approached the specified length. Thus, test 2 essentially
examines the reference node aggregation algorithm where the engineering parameters
have no impact on the shortest path solutions. Tests 3 and 4 use engineering
parameter values that restrict shortest path solutions subject to the designed rules.

The data produced by Tests 1 through 4 consisted of the accumulated time it
took each algorithm to solve the shortest path for a given set of four reference nodes.
Thus, each test produced 32 data points.


The Friedman Test [Ref. 9: page 299] is a non-parametric test which makes no
distributional assumptions. It utilizes a randomized complete block design to test the
null hypothesis that treatment effects are equal, with the alternate hypothesis that at
least two effects are not equal. To this end, the algorithms were considered treatments,
while the reference set/sample network pairs were blocks. Thus, the randomized
complete block design consists of eight treatments and sixteen blocks. Figure 3.3
presents the data in the randomized complete block format utilized for the Friedman
Test. In this case, the null hypothesis is that the mean execution time to
produce a shortest path solution for a given network and reference node set is the same
regardless of the algorithm used. The alternate then becomes that at least two of the
algorithm implementations have different mean execution times.

Utilizing an α-level of 0.05, the Friedman test statistic was computed, giving
T2 = 113.6. The distribution of T2 is approximated by an F distribution with (7,105)
degrees of freedom. With F(7,105) for α=0.05 equal to 2.109, the null hypothesis was
rejected, enabling use of the multiple comparison extension of the Friedman Test
[Ref. 9: page 297]. This multiple comparison showed, for the sample networks, that
none of the treatment effects were equal statistically. Thus, the implemented algorithm
is a robust one.
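
For readers who wish to reproduce a statistic of this kind, the sketch below computes a rank-based T2 for a randomized complete block design. It is a hedged Python illustration that assumes the F-approximation form of the Friedman statistic, T2 = (b-1)[B2 - bk(k+1)²/4] / (A2 - B2), where A2 is the sum of squared within-block ranks and B2 is the sum of squared treatment rank totals divided by the number of blocks b; whether this matches the cited reference term for term is an assumption. The function name `friedman_t2` is hypothetical.

```python
def friedman_t2(blocks):
    """Compute the rank-based T2 statistic for b blocks of k treatments each.

    blocks: list of rows, one row per block, one observation per treatment.
    Compare the result with the F distribution on (k-1, (b-1)(k-1)) degrees of freedom.
    """
    b, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    a2 = 0.0                                     # sum of squared ranks
    for row in blocks:
        order = sorted(range(k), key=lambda j: row[j])
        ranks = [0.0] * k
        pos = 0
        while pos < k:
            end = pos
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            avg = (pos + end) / 2 + 1            # average rank for a tied group
            for t in range(pos, end + 1):
                ranks[order[t]] = avg
            pos = end + 1
        for j in range(k):
            rank_sums[j] += ranks[j]
            a2 += ranks[j] ** 2
    b2 = sum(r * r for r in rank_sums) / b
    if a2 - b2 <= 1e-12:
        return float("inf")                      # identical ranking in every block
    return (b - 1) * (b2 - b * k * (k + 1) ** 2 / 4) / (a2 - b2)
```

With k = 8 treatments and b = 16 blocks the reference distribution has (7, 105) degrees of freedom, which agrees with the degrees of freedom quoted above.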

Further, this comparison indicated that, for the sample problems, the reference
node aggregation algorithm utilizing the improved label setting base outperformed (i.e.,
was faster than) the version which used the basic label setting as a base. Table 5 provides a
summary of the algorithm performance for the sample networks.

5. Conclusions 

The design and implementation of the reference node aggregation algorithm
has been successfully accomplished. In addition, the testing showed that:

• Both label setting and improved label setting served as an adequate base for the
algorithm implementation, and the resulting reference node aggregation
algorithm functioned as planned.

• The performance of the improved label setting base was superior to that of the
basic label setting as implemented in the reference node aggregation algorithm.

• The engineering parameter approach is a robust one and demonstrated its
ability to enable the user to scale the reference node aggregation algorithm
with respect to the scope of the shortest path solutions produced.

Hypothesis:


Ho : The treatments have identical effects within a block. 


H) : At least one treatment tends to yield larger observed 
values than another treatment. 


Note: Each data entry is the execution time in CPU seconds for an algorithm 
(1.e., treatment) to solve the all-pairs of R shortest path problem with IRI = 4 
reference nodes, given networks of dimensions specified in the block definition 
below. 





Treatment 
1 2 3 G 5 6 7 8 
Block 
1 27a 11.13 15.31 10.87 9.12 5.12 8.51 0.44 
2 27.43 10.97 13.15 10.67 12.24 5.93 5.93 0.39 
3 27.74 Li-02 20.20 10.70 15.43 3.55 15.10 0.21 
G 27.70 11.01 13.76 10.84 17.61 3.28 17.79 0.25 
5 27.91 11.14 18.49 11.20 17.96 5.57 20.00 0.56 
6 26.70 10.98 Lew le P35 i. Sl 3.91 17.99 0.26 
a 26.52 11.09 22.23 10.98 14.95 5.28 17.00 0.51 
8 26.73 10.83 11.31 11.08 11.36 6.07 5.41 0.38 
9 38.59 12.12 28.58 11.65 28.83 11.75 21.47 6.36 
10 38.47 11.96 30.92 11.64 31.11 11.70 23.42 8.89 
ll 38.86 11.86 27.18 11.62 27.70 2) 537 18.78 9.42 
ik 37.89 11.81 25.21 11.54 23.16 11.67 19.69 Tati 
13 37.81 11.97 19.29 11.90 18.67 11.85 20.95 6.47 
14 38.10 11.91 15.12 11.85 14.58 11.76 14.46 8.81 
15 38.29 11.98 15.07 11.78 14.28 11.55 14.69 9.14 
16 38.62 11.83 28.03 le? 9 27.04 11.46 18.70 5.32 


Treatment Definitions: 
1 Basic Label Setting Algorithm 
Improved Label Setting Algorithm 


Reference Node Aggregation Algorithm using Basic 
Label Setting with EP1=9991, EP2=9992, and EP3=9999. 


Reference Node Aggregation Algorithm using Improved 
Label Setting with EP1=9991, EP2=9992, and EP3=9999. 


Reference Node Aggregation Algorithm using Basic 
Label Setting with EP1=50, EP2=60, and EP3=100. 


Reference Node Aggregation Algorithm using Improved 
Label Setting with EP1=50, EP2=60, EP3=100. 





Figure 3.3. Randonuzed Complete Block Design for the Friedman Test. 
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Treatment Definition (con't): 


7 Reference Node Aggregation Algorithm using Basic 
Label Setting with EP1=25, EP2=30, and EP3=50. 


Reference Node Aggregation Algorithm using Improved 
Label Setting with EP1=25, EP2=30, and EP3=50. 


Block Definitions: 


Figiives 


Network 
Network 
Network 
Network 
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Network 
Network 
Network 
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Network 
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Network 
Ne twork 
Network 
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Randomized Complete Block Design for the Friedman Test. (cont‘d.) 
[Table: sample mean and sample standard deviation of CPU time for each treatment, 
reported separately for Networks A & B and Networks C & D.] 

Treatments: 

1 Basic Label Setting Algorithm 

2 Improved Label Setting Algorithm 

3 Reference Node Aggregation Algorithm using Basic 
Label Setting with EP1=9991, EP2=9992, and EP3=9999. 

4 Reference Node Aggregation Algorithm using Improved 
Label Setting with EP1=9991, EP2=9992, and EP3=9999. 

5 Reference Node Aggregation Algorithm using Basic 
Label Setting with EP1=50, EP2=60, and EP3=100. 

6 Reference Node Aggregation Algorithm using Improved 
Label Setting with EP1=50, EP2=60, and EP3=100. 

7 Reference Node Aggregation Algorithm using Basic 
Label Setting with EP1=25, EP2=30, and EP3=50. 

8 Reference Node Aggregation Algorithm using Improved 
Label Setting with EP1=25, EP2=30, and EP3=50. 


IV. CONCLUSIONS 


A. SOTACA 

Chapter II presented two problems that have arisen in the use of the OJCS 
contingency planning model SOTACA and proposed means of resolving them. 
SOTACA uses an implementation of Floyd’s algorithm to compute an all-pairs 
shortest path solution and experiences slow execution (i.e., upwards of twenty minutes 
on a MICRO VAX) when dealing with networks at or near the maximum allowable 
size, which is very small by contemporary standards. The author has proposed that this 
first problem be resolved by modifying SOTACA so that a single-source shortest path 
algorithm, label setting or label correcting, is used to produce shortest path solutions 
on demand, vice the current method of preprocessing all pairs. Accompanying this 
algorithm change, it has been proposed that the SOTACA network representation be 
changed to a forward star format because of the efficiencies gained. Testing showed 
that these algorithms and the forward star network representation produced shortest 
path solutions very quickly, even for networks at the maximum allowable size. With 
these changes, it is anticipated that SOTACA’s slow execution problem will be 
resolved. However, the reader should be aware that other implementations of 
single-source shortest path algorithms exist in the literature and could also be 
applied to this problem. 
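
The combination recommended above can be sketched as follows. This is a minimal illustration, not the code tested in Chapter II: it assumes nonnegative arc costs, nodes numbered from 0, and a binary heap for selecting the next node to label, and the names build_forward_star and label_setting are hypothetical.

```python
import heapq


def build_forward_star(num_nodes, arcs):
    """Pack (tail, head, cost) arcs into forward star arrays.

    The arcs leaving node i occupy positions point[i] .. point[i+1]-1
    of the head and cost arrays.
    """
    arcs = sorted(arcs)                      # group arcs by tail node
    point = [0] * (num_nodes + 1)
    for tail, _, _ in arcs:
        point[tail + 1] += 1                 # count arcs leaving each tail
    for i in range(num_nodes):
        point[i + 1] += point[i]             # prefix sums give start positions
    head = [h for _, h, _ in arcs]
    cost = [c for _, _, c in arcs]
    return point, head, cost


def label_setting(point, head, cost, source):
    """Single-source shortest path labels and predecessors."""
    n = len(point) - 1
    dist = [float("inf")] * n
    pred = [-1] * n
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue                         # stale entry; node already labeled
        for a in range(point[i], point[i + 1]):
            j = head[a]
            if d + cost[a] < dist[j]:
                dist[j] = d + cost[a]
                pred[j] = i
                heapq.heappush(heap, (dist[j], j))
    return dist, pred


arcs = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5), (2, 3, 8)]
point, head, cost = build_forward_star(4, arcs)
dist, pred = label_setting(point, head, cost, source=0)
```

Because the arcs leaving a node are stored contiguously, scanning a node touches only its own arcs, and a solution is computed only for the source actually requested rather than for all pairs in advance.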

As for SOTACA’s second problem, the failure to recognize alternate shortest 
paths, the use of a depth-first search (with the modified label setting algorithm) has 
been shown to correctly locate and enumerate alternate shortest paths. However, there 
are several research and implementation issues which this thesis has only touched 
upon. Some of the issues that the author feels should be examined are: 


(1) the impact on SOTACA of the additional code and data structures required to 
implement a means for enumerating alternate shortest paths, 

(2) the effects of this added capability with respect to model execution (i.e., time), 
and 

(3) a comparison (speed, data structure size, source code size, ...) of the 
depth-first search versus the breadth-first search, or any other methodologies for 
enumerating alternate shortest paths. 

Nonetheless, what has been shown is that the alternate shortest path problem can be 
resolved and that methods to address it are readily available. 
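
One way to enumerate alternate shortest paths with a depth-first search can be sketched as follows. This is an illustration under stated assumptions, not the modified label setting algorithm of Chapter II: arc costs are assumed to be strictly positive integers (so that the arcs satisfying dist[j] = dist[i] + cost contain no cycle and the equality test is exact), the labels come from a simple label-correcting pass, and all function names are hypothetical.

```python
def shortest_distances(num_nodes, arcs, source):
    """Label-correcting pass over (tail, head, cost) arcs."""
    dist = [float("inf")] * num_nodes
    dist[source] = 0
    for _ in range(num_nodes - 1):
        changed = False
        for i, j, c in arcs:
            if dist[i] + c < dist[j]:
                dist[j] = dist[i] + c
                changed = True
        if not changed:
            break
    return dist


def enumerate_shortest_paths(num_nodes, arcs, source, target):
    """Return every shortest path from source to target as a list of nodes."""
    dist = shortest_distances(num_nodes, arcs, source)
    # An arc (i, j) lies on some shortest path only if dist[j] == dist[i] + cost.
    tight_preds = [[] for _ in range(num_nodes)]
    for i, j, c in arcs:
        if dist[i] != float("inf") and dist[i] + c == dist[j]:
            tight_preds[j].append(i)

    paths = []

    def dfs(node, suffix):
        # Walk backward from the target along tight arcs until the source.
        if node == source:
            paths.append([source] + suffix)
            return
        for p in tight_preds[node]:
            dfs(p, [node] + suffix)

    if dist[target] != float("inf"):
        dfs(target, [])
    return sorted(paths)


arcs = [(0, 1, 2), (0, 2, 2), (1, 3, 3), (2, 3, 3), (0, 3, 6)]
paths = enumerate_shortest_paths(4, arcs, source=0, target=3)
```

In this small network the two routes of cost 5 are both returned, while the direct arc of cost 6 is excluded because it is not tight. The extra storage is one predecessor list per node, which bears on issue (1) above.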




B. REFERENCE NODE AGGREGATION 

In Chapter III, a new algorithm, reference node aggregation, was proposed. This 
algorithm is designed to produce a subset of the all-pairs shortest path solution for 
large scale networks. This subset solution, an all-pairs of R solution, is specifically 
structured and is intended to support a vehicle routing model by providing the means 
by which it can quickly compute (i.e., in one step) the approximate cost of the shortest 
path between any two nodes in the network. Additionally, the algorithm provides for 
user-specification of three engineering parameters. These parameters can be used to 
make tradeoffs between the accuracy of the all-pairs of R solution and the total time it 
takes to produce it. 
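
The "one step" lookup can be pictured with a small sketch. This is only one plausible reading and not the data structure of Chapter III: it assumes that every node has been assigned to a single reference node and that the all-pairs of R solution is held as a table indexed by reference nodes. The names ref_of, ref_cost, and approximate_cost, and all the numbers, are hypothetical.

```python
# Hypothetical assignment of network nodes to reference nodes in R.
ref_of = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C"}

# Hypothetical all-pairs of R solution: shortest path costs between reference nodes.
ref_cost = {
    "A": {"A": 0, "B": 7, "C": 12},
    "B": {"A": 7, "B": 0, "C": 5},
    "C": {"A": 12, "B": 5, "C": 0},
}


def approximate_cost(i, j):
    """Approximate the shortest path cost from i to j by a single table lookup."""
    return ref_cost[ref_of[i]][ref_of[j]]


estimate = approximate_cost(1, 3)   # looks up the cost between reference nodes A and B
```

The table grows with the square of the number of reference nodes rather than the number of network nodes, which illustrates the accuracy-versus-time tradeoff described above: fewer reference nodes give a smaller, faster solution but a coarser approximation.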

The algorithm was implemented in two forms, one using a basic label setting 
methodology and the other using an improved (one look per arc) label setting 
methodology. Testing demonstrated that both implementations were successful and 
that the improved label setting methodology was superior, as it produced the 
subset solution significantly faster. Also, the engineering parameter approach 
proved to be flexible: it enables the user to adjust the algorithm to fit the 
individual problem, and it provides a means to take advantage of computer 
hardware improvements without modifying the algorithm. 

However, the author feels that the reference node aggregation algorithm, as 
presented in this thesis, can be improved upon. Specifically, the following are potential 
areas of improvement: 

(1) Determining which and how many of the previous solutions to examine when 
looking for successors to a reference root node. The current implementation 
examines them sequentially. 

(2) Determining which and how many of the previous solutions to examine when 
approximating one shortest path (SP) by another. 

(3) Reorganizing the sequence of the steps so that the algorithm is provided some 
“room to run” before it starts expending effort to check on whether all 
reference nodes are labeled, or whether the labeling has reached the maximum 
distance from the root to be examined. 

The main point is that this thesis has concentrated on showing that the reference node 
aggregation concept (with its engineering parameter approach) works, and that 
additional analysis could improve the functioning of the algorithm. 

Beyond improvements, the next step (though not part of this thesis) is the major 
one: embedding the reference node aggregation algorithm in a vehicle 
routing model and then assessing its performance. In this way, the validity of the 
reference node aggregation algorithm with its engineering parameter approach can be 
shown. 